Search | VHL Regional Portal

1.

Prevalence and determinants of subretinal drusenoid deposits in patients' first-degree relatives.

Mauschitz, Matthias M; Hochbein, Benedikt J; Klinkhammer, Hannah; Saßmannshausen, Marlene; Terheyden, Jan H; Krawitz, Peter; Finger, Robert P.

Graefes Arch Clin Exp Ophthalmol ; 262(1): 53-60, 2024 Jan.

Article in English | MEDLINE | ID: mdl-37672102

ABSTRACT

PURPOSE: Subretinal drusenoid deposits (SDDs) are distinct extracellular alterations anterior to the retinal pigment epithelium (RPE). Given their commonly uniform phenotype, a hereditary predisposition seems likely. Hence, we aim to investigate prevalence and determinants in patients' first-degree relatives. METHODS: We recruited SDD outpatients at their visits to our clinic and invited their relatives. We performed a full ophthalmic examination including spectral domain-optical coherence tomography (SD-OCT) and graded the presence and disease stage of SDD as well as the percentage of infrared (IR) en face area affected by SDD. Moreover, we performed genetic sequencing and calculated a polygenic risk score (PRS) for age-related macular degeneration (AMD). We fitted multivariable regression models to assess potential determinants of SDD and associations of SDD with PRS. RESULTS: We included 195 participants, 123 patients (mean age 81.4 ± 7.2 years) and 72 relatives (mean age 52.2 ± 14.2 years). Of the relatives, 7 presented with SDD, resulting in a prevalence of 9.7%. We found older age to be associated with SDD presence and area in the total cohort and a borderline association of higher body mass index (BMI) with SDD presence in the relatives. Individuals with SDD tended to have a higher PRS, which, however, was not statistically significant in the multivariable regression. CONCLUSION: Our study indicates a potential hereditary aspect of SDD and confirms the strong association with age. Based on our results, relatives of SDD patients ought to be closely monitored for retinal alterations, particularly at an older age. Further longitudinal studies with larger sample sizes and older relatives are needed to confirm or refute our findings.
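The reported prevalence can be checked from the counts given in the abstract (7 affected among 72 relatives). The following is a minimal sketch; the Wilson score interval is an illustrative addition and is not reported in the abstract, and the function name is hypothetical.

```python
from math import sqrt


def prevalence_with_wilson_ci(cases: int, n: int, z: float = 1.96):
    """Return (prevalence, lower, upper) using the Wilson score interval.

    The interval is an illustrative assumption; the abstract reports
    only the point estimate.
    """
    p = cases / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, centre - half_width, centre + half_width


# Counts taken from the abstract: 7 relatives with SDD out of 72 relatives.
prev, lower, upper = prevalence_with_wilson_ci(cases=7, n=72)
print(f"prevalence = {prev:.1%}")  # prevalence = 9.7%
```

With only 7 events, the interval around the point estimate is wide, which is consistent with the authors' call for larger samples.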

Subjects

Retinal Drusen; Humans; Aged; Aged, 80 and over; Adult; Middle Aged; Retinal Drusen/diagnosis; Retinal Drusen/epidemiology; Retinal Drusen/genetics; Prevalence; Retinal Pigment Epithelium; Tomography, Optical Coherence/methods; Fluorescein Angiography

2.

MTHFR C677T and A1298C polymorphism's effect on risk of colorectal cancer in Lynch syndrome.

Wiik, Mariann Unhjem; Negline, Mia; Beisvåg, Vidar; Clapham, Matthew; Holliday, Elizabeth; Dueñas, Nuria; Brunet, Joan; Pineda, Marta; Bonifaci, Nuria; Aretz, Stefan; Klinkhammer, Hannah; Spier, Isabel; Perne, Claudia; Mayr, Andreas; Valle, Laura; Lubinski, Jan; Sjursen, Wenche; Scott, Rodney J; Talseth-Palmer, Bente A.

Sci Rep ; 13(1): 18783, 2023 11 01.

Article in English | MEDLINE | ID: mdl-37914736

ABSTRACT

Lynch syndrome (LS) is characterised by an increased risk of developing colorectal cancer (CRC) and other extracolonic epithelial cancers. It is caused by pathogenic germline variants in DNA mismatch repair (MMR) genes or the EPCAM gene, leading to a less functional DNA MMR system. Individuals diagnosed with LS (LS individuals) have a 10-80% lifetime risk of developing cancer. However, there is considerable variability in the age of cancer onset, which cannot be attributed to the specific MMR gene or variant alone. It is speculated that multiple genetic and environmental factors contribute to this variability, including two single nucleotide polymorphisms (SNPs) in the methylenetetrahydrofolate reductase (MTHFR) gene: C677T (rs1801133) and A1298C (rs1801131). By decreasing MTHFR activity, these SNPs theoretically reduce the silencing of DNA repair genes and increase the availability of nucleotides for DNA synthesis and repair, thereby protecting against early-onset cancer in LS. We investigated the effect of these SNPs on LS disease expression in 2,723 LS individuals from Australia, Poland, Germany, Norway and Spain. The association between age at cancer onset and SNP genotype (risk of cancer) was estimated using Cox regression adjusted for gender, country and affected MMR gene. For A1298C (rs1801131), both the AC and CC genotypes were significantly associated with a reduced risk of developing CRC compared to the AA genotype, but no association was seen for C677T (rs1801133). However, an aggregated effect of protective alleles was seen when combining the alleles from the two SNPs, especially for LS individuals carrying 1 and 2 alleles. 
For individuals with germline pathogenic variants in MLH1, the CC genotype of A1298C was estimated to reduce the risk of CRC significantly by 39% (HR = 0.61, 95% CI 0.42, 0.89, p = 0.011), while for individuals with pathogenic germline MSH2 variants, the AC genotype (compared to AA) was estimated to reduce the risk of CRC by 26% (HR = 0.66, 95% CI 0.53, 0.83, p = 0.01). In comparison, no association was observed for C677T (rs1801133). In conclusion, our study suggests that combining the MMR gene information with the MTHFR genotype, including the aggregated effect of protective alleles, could be useful in developing an algorithm that estimates the risk of CRC in LS individuals.

Subjects

Colorectal Neoplasms, Hereditary Nonpolyposis; Colorectal Neoplasms; Humans; Colorectal Neoplasms, Hereditary Nonpolyposis/genetics; Colorectal Neoplasms/epidemiology; Colorectal Neoplasms/genetics; Colorectal Neoplasms/pathology; Methylenetetrahydrofolate Reductase (NADPH2)/genetics; Genotype; Polymorphism, Single Nucleotide; DNA; Genetic Predisposition to Disease; Case-Control Studies

3.

Gene-based burden scores identify rare variant associations for 28 blood biomarkers.

Aldisi, Rana; Hassanin, Emadeldin; Sivalingam, Sugirthan; Buness, Andreas; Klinkhammer, Hannah; Mayr, Andreas; Fröhlich, Holger; Krawitz, Peter; Maj, Carlo.

BMC Genom Data ; 24(1): 50, 2023 09 04.

Article in English | MEDLINE | ID: mdl-37667186

ABSTRACT

BACKGROUND: A relevant part of the genetic architecture of complex traits is still unknown, despite the discovery of many disease-associated common variants. Polygenic risk score (PRS) models are based on the evaluation of the additive effects attributable to common variants and have been successfully implemented to assess the genetic susceptibility for many phenotypes. In contrast, burden tests are often used to identify an enrichment of rare deleterious variants in specific genes. Both kinds of genetic contributions are typically analyzed independently. Many studies suggest that complex phenotypes are influenced by both low effect common variants and high effect rare deleterious variants. The aim of this paper is to integrate the effect of both common and rare functional variants for more comprehensive genetic risk modeling. METHODS: We developed a framework combining gene-based scores based on the enrichment of rare functionally relevant variants with genome-wide PRS based on common variants for association analysis and prediction models. We applied our framework to the UK Biobank dataset with genotyping and exome data and considered 28 blood biomarker levels as target phenotypes. For each biomarker, an association analysis was performed on the full cohort using gene-based scores (GBS). The cohort was then split into 3 subsets for PRS construction and feature selection, predictive model training, and independent evaluation, respectively. Prediction models were generated including either PRS, GBS or both (combined). RESULTS: Association analyses of the cohort were able to detect significant genes that were previously known to be associated with different biomarkers. Interestingly, the analyses also revealed heterogeneous effect sizes and directionality, highlighting the complexity of blood biomarker regulation. However, the combined models for many biomarkers show little or no improvement in prediction accuracy compared to the PRS models.
CONCLUSION: This study shows that rare variants play an important role in the genetic architecture of complex multifactorial traits such as blood biomarkers. However, while rare deleterious variants play a strong role at an individual level, our results indicate that classical common variant based PRS might be more informative to predict the genetic susceptibility at the population level.

Subjects

Exome; Genetic Predisposition to Disease; Humans; Genetic Predisposition to Disease/genetics; Biomarkers; Phenotype; Multifactorial Inheritance/genetics

4.

CUX1-related neurodevelopmental disorder: deep insights into phenotype-genotype spectrum and underlying pathology.

Oppermann, Henry; Marcos-Grañeda, Elia; Weiss, Linnea A; Gurnett, Christina A; Jelsig, Anne Marie; Vineke, Susanne H; Isidor, Bertrand; Mercier, Sandra; Magnussen, Kari; Zacher, Pia; Hashim, Mona; Pagnamenta, Alistair T; Race, Simone; Srivastava, Siddharth; Frazier, Zoë; Maiwald, Robert; Pergande, Matthias; Milani, Donatella; Rinelli, Martina; Levy, Jonathan; Krey, Ilona; Fontana, Paolo; Lonardo, Fortunato; Riley, Stephanie; Kretzer, Jasmine; Rankin, Julia; Reis, Linda M; Semina, Elena V; Reuter, Miriam S; Scherer, Stephen W; Iascone, Maria; Weis, Denisa; Fagerberg, Christina R; Brasch-Andersen, Charlotte; Hansen, Lars Kjaersgaard; Kuechler, Alma; Noble, Nathan; Gardham, Alice; Tenney, Jessica; Rathore, Geetanjali; Beck-Woedl, Stefanie; Haack, Tobias B; Pavlidou, Despoina C; Atallah, Isis; Vodopiutz, Julia; Janecke, Andreas R; Hsieh, Tzung-Chien; Lesmann, Hellen; Klinkhammer, Hannah; Krawitz, Peter M.

Eur J Hum Genet ; 31(11): 1251-1260, 2023 11.

Article in English | MEDLINE | ID: mdl-37644171

ABSTRACT

Heterozygous, pathogenic CUX1 variants are associated with global developmental delay or intellectual disability. This study delineates the clinical presentation in an extended cohort and investigates the molecular mechanism underlying the disorder in a Cux1+/- mouse model. Through international collaboration, we assembled the phenotypic and molecular information for 34 individuals (23 unpublished individuals). We analyze brain CUX1 expression and susceptibility to epilepsy in Cux1+/- mice. We describe 34 individuals, of whom 30 were unrelated, with 26 different null and four missense variants. The leading symptoms were mild to moderate delayed speech and motor development and borderline to moderate intellectual disability. Additional symptoms were muscular hypotonia, seizures, joint laxity, and abnormalities of the forehead. In Cux1+/- mice, we found delayed growth, histologically normal brains, and increased susceptibility to seizures. In Cux1+/- brains, the expression of Cux1 transcripts was half that of WT animals. Expression of CUX1 proteins was reduced, although in early postnatal animals significantly more than in adults. In summary, disease-causing CUX1 variants result in a non-syndromic phenotype of developmental delay and intellectual disability. In some individuals, this phenotype ameliorates with age, resulting in a clinical catch-up and normal IQ in adulthood. The post-transcriptional balance of CUX1 expression in the heterozygous brain at late developmental stages appears important for this favorable clinical course.

Subjects

Intellectual Disability; Neurodevelopmental Disorders; Adult; Animals; Humans; Mice; Heterozygote; Homeodomain Proteins/genetics; Intellectual Disability/genetics; Intellectual Disability/diagnosis; Neurodevelopmental Disorders/genetics; Neurodevelopmental Disorders/pathology; Phenotype; Repressor Proteins/genetics; Seizures; Transcription Factors/genetics; Transcription Factors/metabolism

5.

AI-based multi-PRS models outperform classical single-PRS models.

Klau, Jan Henric; Maj, Carlo; Klinkhammer, Hannah; Krawitz, Peter M; Mayr, Andreas; Hillmer, Axel M; Schumacher, Johannes; Heider, Dominik.

Front Genet ; 14: 1217860, 2023.

Article in English | MEDLINE | ID: mdl-37441549

ABSTRACT

Polygenic risk scores (PRS) calculate the risk for a specific disease based on the weighted sum of associated alleles from different genetic loci in the germline estimated by regression models. Recent advances in genetics made it possible to create polygenic predictors of complex human traits, including risks for many important complex diseases, such as cancer, diabetes, or cardiovascular diseases, typically influenced by many genetic variants, each of which has a negligible effect on overall risk. In the current study, we analyzed whether adding additional PRS from other diseases to the prediction models and replacing the regressions with machine learning models can improve overall predictive performance. Results showed that multi-PRS models outperform single-PRS models significantly on different diseases. Moreover, replacing regression models with machine learning models, i.e., deep learning, can also improve overall accuracy.
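The weighted-sum definition of a PRS given in this abstract can be illustrated with a short sketch. All weights, dosages, and function names below are hypothetical toy values chosen for illustration; they are not taken from the study, and the second function only shows how several scores might be collected as input features for a downstream model, not the authors' actual pipeline.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1 or 2 per locus)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * d for w, d in zip(weights, dosages))


def multi_prs_features(dosages, weight_sets):
    """Compute one PRS per disease-specific weight set.

    Returns a dict mapping a (hypothetical) disease label to its score,
    which could serve as a feature vector for a multi-PRS model.
    """
    return {name: polygenic_risk_score(dosages, w)
            for name, w in weight_sets.items()}


# Toy example: three loci, per-allele weights as a regression model
# might estimate them (hypothetical values).
dosages = [2, 1, 0]
weight_sets = {
    "disease_a": [0.10, -0.05, 0.20],
    "disease_b": [0.00, 0.30, 0.15],
}

score = polygenic_risk_score(dosages, weight_sets["disease_a"])
features = multi_prs_features(dosages, weight_sets)
print(features)
```

A single-PRS model would use only the score for the target disease, whereas a multi-PRS model would pass the whole feature dictionary to the classifier.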

6.

GestaltMatcher Database - a FAIR database for medical imaging data of rare disorders.

Lesmann, Hellen; Lyon, Gholson J; Caro, Pilar; Abdelrazek, Ibrahim M; Moosa, Shahida; Pantel, Jean Tori; Hagen, Merle Ten; Rosnev, Stanislav; Kamphans, Tom; Meiswinkel, Wolfgang; Li, Jing-Mei; Klinkhammer, Hannah; Hustinx, Alexander; Javanmardi, Behnam; Knaus, Alexej; Uwineza, Annette; Knopp, Cordula; Marchi, Elaine; Elbracht, Miriam; Mattern, Larissa; Jamra, Rami Abou; Velmans, Clara; Strehlow, Vincent; Nabil, Amira; Graziano, Claudio; Artem, Borovikov; Schnabel, Franziska; Heuft, Lara; Herrmann, Vera; Höller, Matthias; Alaaeldin, Khoshoua; Jezela-Stanek, Aleksandra; Mohamed, Amal; Lasa-Aranzasti, Amaia; Elmakkawy, Gehad; Safwat, Sylvia; Ebstein, Frédéric; Küry, Sébastien; Arlt, Annabelle; Marbach, Felix; Netzer, Christian; Kaptain, Sophia; Weiland, Hannah; Devriendt, Koen; Gripp, Karen W; Mücke, Martin; Verloes, Alain; Schaaf, Christian P; Nellåker, Christoffer; Solomon, Benjamin D.

medRxiv ; 2023 Jun 10.

Article in English | MEDLINE | ID: mdl-37503210

ABSTRACT

The value of computer-assisted image analysis has been shown in several studies. The performance of tools with artificial intelligence (AI), such as GestaltMatcher, is improved with the size and diversity of the training set, but properly labeled training data is currently the biggest bottleneck in developing next-generation phenotyping (NGP) applications. Therefore, we developed GestaltMatcher Database (GMDB) - a database for machine-readable medical image data that complies with the FAIR principles and improves the openness and accessibility of scientific findings in Medical Genetics. An entry in GMDB consists of a medical image such as a portrait, X-ray, or fundoscopy, and machine-readable meta information such as a clinical feature encoded in HPO terminology or a disease-causing mutation reported in HGVS format. In the beginning, data was mainly collected by curators gathering images from the literature. Currently, clinicians and individuals recruited from patient support groups provide their previously unpublished data. For this patient-centered approach, we developed a digital consent form. GMDB is a modern publication medium for case reports that complements preprints, e.g., on medRxiv. To enable inter-cohort comparisons, we implemented a research feature in GMDB that computes the pairwise syndromic similarity between hand-picked cases. Through a community-driven effort, we compiled an image collection of over 7,533 cases with 792 disorders in GMDB. Most of the data was collected from 2,058 publications. In addition, about 1,018 frontal images of 498 previously unpublished cases were obtained. The web interface enables gene- and phenotype-centered queries or infinite scrolls in the gallery. Digital consent has led to increasing adoption of the approach by patients. The research app within GMDB was used to generate syndromic similarity matrices to characterize two novel phenotypes (CSNK2B, PSMC3). 
GMDB is the first FAIR database for NGP, where data are findable, accessible, interoperable, and reusable. It is a repository for medical images that cannot be included in medRxiv. That means GMDB connects clinicians with a shared interest in particular phenotypes and improves the performance of AI.

7.

Assessing the performance of European-derived cardiometabolic polygenic risk scores in South-Asians and their interplay with family history.

Hassanin, Emadeldin; Maj, Carlo; Klinkhammer, Hannah; Krawitz, Peter; May, Patrick; Bobbili, Dheeraj Reddy.

BMC Med Genomics ; 16(1): 164, 2023 07 12.

Article in English | MEDLINE | ID: mdl-37438803

ABSTRACT

BACKGROUND & AIMS: We aimed to assess the performance of European-derived polygenic risk scores (PRSs) for common metabolic diseases such as coronary artery disease (CAD), obesity, and type 2 diabetes (T2D) in South Asian (SAS) individuals in the UK Biobank. Additionally, we studied the interaction between PRS and family history (FH) in the same population. METHODS: To calculate the PRS, we used a previously published model derived from the European (EUR) population and applied it to the individuals of SAS ancestry from the UK Biobank (UKB) study. Each PRS was adjusted according to an individual's genotype location in the principal components (PC) space to derive an ancestry-adjusted PRS (aPRS). We calculated the percentiles based on aPRS and stratified individuals into three aPRS categories: low, intermediate, and high. Taking the intermediate-aPRS category as the reference, we compared the low and high aPRS categories and generated the odds ratio (OR) estimates. Further, we measured the combined role of aPRS and first-degree family history (FH) in the SAS population. RESULTS: The risk of developing severe obesity for SAS individuals was almost twofold higher for individuals with high aPRS than for those with intermediate aPRS, with an OR of 1.95 (95% CI = 1.71-2.23, P < 0.01). At the same time, the risk of severe obesity was lower in the low-aPRS group (OR = 0.60, CI = 0.53-0.67, P < 0.01). Results in the same direction were found in the EUR data, where the low-PRS group had an OR of 0.53 (95% CI = 0.51-0.56, P < 0.01) and the high-PRS group had an OR of 2.06 (95% CI = 2.00-2.12, P < 0.01). We observed similar results for CAD and T2D. Further, we show that SAS individuals with a family history of CAD and T2D and a high aPRS have a higher risk of these diseases, implying a greater genetic predisposition.
CONCLUSION: Our findings suggest that CAD, obesity, and T2D GWAS summary statistics generated predominantly from the EUR population can be potentially used to derive aPRS in SAS individuals for risk stratification. With future GWAS recruiting more SAS participants and tailoring the PRSs towards SAS ancestry, the predictive power of PRS is likely to improve further.

Subjects

Coronary Artery Disease; Diabetes Mellitus, Type 2; Obesity, Morbid; Humans; Coronary Artery Disease/genetics; Diabetes Mellitus, Type 2/genetics; Obesity/genetics; Risk Factors; United Kingdom; Asian People; Multifactorial Inheritance

8.

Ability of a polygenic risk score to refine colorectal cancer risk in Lynch syndrome.

Dueñas, Nuria; Klinkhammer, Hannah; Bonifaci, Nuria; Spier, Isabel; Mayr, Andreas; Hassanin, Emadeldin; Diez-Villanueva, Anna; Moreno, Victor; Pineda, Marta; Maj, Carlo; Capellà, Gabriel; Aretz, Stefan; Brunet, Joan.

J Med Genet ; 60(11): 1044-1051, 2023 Nov.

Article in English | MEDLINE | ID: mdl-37321833

ABSTRACT

BACKGROUND: Polygenic risk scores (PRSs) have been used to stratify colorectal cancer (CRC) risk in the general population, whereas evidence on their role in Lynch syndrome (LS), the most common type of hereditary CRC, is still conflicting. We aimed to assess the ability of PRS to refine CRC risk prediction in European-descendant individuals with LS. METHODS: 1465 individuals with LS (557 MLH1, 517 MSH2/EPCAM, 299 MSH6 and 92 PMS2) and 5656 CRC-free population-based controls from two independent cohorts were included. A 91-SNP PRS was applied. A Cox proportional hazard regression model with 'family' as a random effect and a logistic regression analysis, followed by a meta-analysis combining both cohorts, were conducted. RESULTS: Overall, we did not observe a statistically significant association between PRS and CRC risk in the entire cohort. Nevertheless, PRS was significantly associated with a slightly increased risk of CRC or advanced adenoma (AA), in those with CRC diagnosed <50 years and in individuals with multiple CRCs or AAs diagnosed <60 years. CONCLUSION: The PRS may slightly influence CRC risk in individuals with LS, in particular in more extreme phenotypes such as early-onset disease. However, the study design and recruitment strategy strongly influence the results of PRS studies. A separate analysis by genes and its combination with other genetic and non-genetic risk factors will help refine its role as a risk modifier in LS.

9.

Expanding the phenotypic spectrum of NAA10-related neurodevelopmental syndrome and NAA15-related neurodevelopmental syndrome.

Lyon, Gholson J; Vedaie, Marall; Beisheim, Travis; Park, Agnes; Marchi, Elaine; Gottlieb, Leah; Hsieh, Tzung-Chien; Klinkhammer, Hannah; Sandomirsky, Katherine; Cheng, Hanyin; Starr, Lois J; Preddy, Isabelle; Tseng, Marcellus; Li, Quan; Hu, Yu; Wang, Kai; Carvalho, Ana; Martinez, Francisco; Caro-Llopis, Alfonso; Gavin, Maureen; Amble, Karen; Krawitz, Peter; Marmorstein, Ronen; Herr-Israel, Ellen.

Eur J Hum Genet ; 31(7): 824-833, 2023 07.

Article in English | MEDLINE | ID: mdl-37130971

ABSTRACT

Amino-terminal (Nt-) acetylation (NTA) is a common protein modification, affecting 80% of cytosolic proteins in humans. The human essential gene, NAA10, encodes the enzyme NAA10, which is the catalytic subunit in the N-terminal acetyltransferase A (NatA) complex, also including the accessory protein, NAA15. The full spectrum of human genetic variation in this pathway is currently unknown. Here we reveal the genetic landscape of variation in NAA10 and NAA15 in humans. Through a genotype-first approach, one clinician interviewed the parents of 56 individuals with NAA10 variants and 19 individuals with NAA15 variants, which were added to all known cases (N = 106 for NAA10 and N = 66 for NAA15). Although there is clinical overlap between the two syndromes, functional assessment demonstrates that the overall level of functioning for the probands with NAA10 variants is significantly lower than that for the probands with NAA15 variants. The phenotypic spectrum includes variable levels of intellectual disability, delayed milestones, autism spectrum disorder, craniofacial dysmorphology, cardiac anomalies, seizures, and visual abnormalities (including cortical visual impairment and microphthalmia). One female with the p.Arg83Cys variant and one female with an NAA15 frameshift variant both have microphthalmia. The frameshift variants located toward the C-terminal end of NAA10 have much less impact on overall functioning, whereas the females with the p.Arg83Cys missense in NAA10 have substantial impairment. The overall data are consistent with a phenotypic spectrum for these alleles, involving multiple organ systems, thus revealing the widespread effect of alterations of the NTA pathway in humans.

Subjects

Autism Spectrum Disorder; Intellectual Disability; Microphthalmos; Humans; Female; Syndrome; N-Terminal Acetyltransferase E/genetics; N-Terminal Acetyltransferase E/metabolism; Genotype; Intellectual Disability/genetics; N-Terminal Acetyltransferase A/genetics; N-Terminal Acetyltransferase A/metabolism

10.

Predicting the pathogenicity of missense variants using features derived from AlphaFold2.

Schmidt, Axel; Röner, Sebastian; Mai, Karola; Klinkhammer, Hannah; Kircher, Martin; Ludwig, Kerstin U.

Bioinformatics ; 39(5)2023 05 04.

Article in English | MEDLINE | ID: mdl-37084271

ABSTRACT

MOTIVATION: Missense variants are a frequent class of variation within the coding genome, and some of them cause Mendelian diseases. Despite advances in computational prediction, classifying missense variants into pathogenic or benign remains a major challenge in the context of personalized medicine. Recently, the structure of the human proteome was derived with unprecedented accuracy using the artificial intelligence system AlphaFold2. This raises the question of whether AlphaFold2 wild-type structures can improve the accuracy of computational pathogenicity prediction for missense variants. RESULTS: To address this, we first engineered a set of features for each amino acid from these structures. We then trained a random forest to distinguish between relatively common (proxy-benign) and singleton (proxy-pathogenic) missense variants from gnomAD v3.1. This yielded a novel AlphaFold2-based pathogenicity prediction score, termed AlphScore. Important feature classes used by AlphScore are solvent accessibility, amino acid network related features, features describing the physicochemical environment, and AlphaFold2's quality parameter (predicted local distance difference test). AlphScore alone showed lower performance than existing in silico scores used for missense prediction, such as CADD or REVEL. However, when AlphScore was added to those scores, the performance increased, as measured by the approximation of deep mutational scan data, as well as the prediction of expert-curated missense variants from the ClinVar database. Overall, our data indicate that the integration of AlphaFold2-predicted structures can improve pathogenicity prediction of missense variants. AVAILABILITY AND IMPLEMENTATION: AlphScore, combinations of AlphScore with existing scores, as well as variants used for training and testing are publicly available.

Subjects

Artificial Intelligence; Computational Biology; Humans; Virulence; Mutation, Missense; Mutation

11.

Boosting multivariate structured additive distributional regression models.

Strömer, Annika; Klein, Nadja; Staerk, Christian; Klinkhammer, Hannah; Mayr, Andreas.

Stat Med ; 42(11): 1779-1801, 2023 05 20.

Article in English | MEDLINE | ID: mdl-36932460

ABSTRACT

We develop a model-based boosting approach for multivariate distributional regression within the framework of generalized additive models for location, scale, and shape. Our approach enables the simultaneous modeling of all distribution parameters of an arbitrary parametric distribution of a multivariate response conditional on explanatory variables, while being applicable to potentially high-dimensional data. Moreover, the boosting algorithm incorporates data-driven variable selection, taking various different types of effects into account. As a special merit of our approach, it allows for modeling the association between multiple continuous or discrete outcomes through the relevant covariates. After a detailed simulation study investigating estimation and prediction performance, we demonstrate the full flexibility of our approach in three diverse biomedical applications. The first is based on high-dimensional genomic cohort data from the UK Biobank, considering a bivariate binary response (chronic ischemic heart disease and high cholesterol). Here, we are able to identify genetic variants that are informative for the association between cholesterol and heart disease. The second application considers the demand for health care in Australia with the number of consultations and the number of prescribed medications as a bivariate count response. The third application analyses two dimensions of childhood undernutrition in Nigeria as a bivariate response and we find that the correlation between the two undernutrition scores is considerably different depending on the child's age and the region the child lives in.

Subjects

Algorithms; Models, Statistical; Child; Humans; Computer Simulation; Australia; Nigeria

12.

Clinically relevant combined effect of polygenic background, rare pathogenic germline variants, and family history on colorectal cancer incidence.

Hassanin, Emadeldin; Spier, Isabel; Bobbili, Dheeraj R; Aldisi, Rana; Klinkhammer, Hannah; David, Friederike; Dueñas, Nuria; Hüneburg, Robert; Perne, Claudia; Brunet, Joan; Capella, Gabriel; Nöthen, Markus M; Forstner, Andreas J; Mayr, Andreas; Krawitz, Peter; May, Patrick; Aretz, Stefan; Maj, Carlo.

BMC Med Genomics ; 16(1): 42, 2023 03 05.

Article in English | MEDLINE | ID: mdl-36872334

ABSTRACT

BACKGROUND AND AIMS: The effect of common, low-penetrance genetic variants associated with colorectal cancer (CRC), summarised in polygenic risk scores (PRS), can be used for risk stratification. METHODS: To assess the combined impact of the PRS and other main factors on CRC risk, 163,516 individuals from the UK Biobank were stratified as follows: 1. carrier status for germline pathogenic variants (PV) in CRC susceptibility genes (APC, MLH1, MSH2, MSH6, PMS2), 2. low (< 20%), intermediate (20-80%), or high PRS (> 80%), and 3. family history (FH) of CRC. Multivariable logistic regression and Cox proportional hazards models were applied to compare odds ratios and to compute the lifetime incidence, respectively. RESULTS: Depending on the PRS, the CRC lifetime incidence for non-carriers ranges between 6 and 22%, compared to 40% and 74% for carriers. A suspicious FH is associated with a further increase of the cumulative incidence reaching 26% for non-carriers and 98% for carriers. In non-carriers without FH, but high PRS, the CRC risk is doubled, whereas a low PRS even in the context of a FH results in a decreased risk. The full model including PRS, carrier status, and FH improved the area under the curve in risk prediction (0.704). CONCLUSION: The findings demonstrate that CRC risks are strongly influenced by the PRS for both a sporadic and monogenic background. FH, PV, and common variants contribute complementarily to CRC risk. The implementation of PRS in routine care will likely improve personalized risk stratification, which will in turn guide tailored preventive surveillance strategies in high, intermediate, and low risk groups.
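The PRS stratification described in the methods (low below the 20th percentile, high above the 80th, intermediate otherwise) and a comparison against the intermediate group can be sketched as follows. This is a simplified illustration: the study used multivariable logistic regression, whereas the sketch computes an unadjusted odds ratio from a 2x2 table, and all counts and function names are hypothetical.

```python
def prs_category(percentile: float) -> str:
    """Map a PRS percentile (0-100) to the three strata named in the abstract."""
    if percentile < 20:
        return "low"
    if percentile > 80:
        return "high"
    return "intermediate"


def odds_ratio(cases_group, controls_group, cases_ref, controls_ref):
    """Unadjusted odds ratio of a group versus a reference group."""
    return (cases_group / controls_group) / (cases_ref / controls_ref)


# Hypothetical toy counts, not taken from the study.
high_cases, high_controls = 30, 970
mid_cases, mid_controls = 45, 2955

or_high_vs_mid = odds_ratio(high_cases, high_controls, mid_cases, mid_controls)
print(prs_category(12.5), prs_category(50.0), prs_category(93.0))
print(f"OR (high vs intermediate) = {or_high_vs_mid:.2f}")
```

Using the intermediate stratum as the reference keeps the comparison anchored to the largest group, which is why both low and high strata are reported relative to it.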

Subjects

Colorectal Neoplasms; Germ-Line Mutation; Humans; Incidence; Risk Factors; Germ Cells

13.

Statistical learning for sparser fine-mapped polygenic models: The prediction of LDL-cholesterol.

Maj, Carlo; Staerk, Christian; Borisov, Oleg; Klinkhammer, Hannah; Wai Yeung, Ming; Krawitz, Peter; Mayr, Andreas.

Genet Epidemiol ; 46(8): 589-603, 2022 12.

Article in English | MEDLINE | ID: mdl-35938382

ABSTRACT

Polygenic risk scores quantify the individual genetic predisposition regarding a particular trait. We propose and illustrate the application of existing statistical learning methods to derive sparser models for genome-wide data with a polygenic signal. Our approach is based on three consecutive steps. First, potentially informative loci are identified by a marginal screening approach. Then, fine-mapping is independently applied for blocks of variants in linkage disequilibrium, where informative variants are retrieved by using variable selection methods including boosting with probing and stochastic searches with the Adaptive Subspace method. Finally, joint prediction models with the selected variants are derived using statistical boosting. In contrast to alternative approaches relying on univariate summary statistics from genome-wide association studies, our three-step approach makes it possible to select and fit multivariable regression models on large-scale genotype data. Based on UK Biobank data, we develop prediction models for LDL-cholesterol as a continuous trait. Additionally, we consider a recent scalable algorithm for the Lasso. Results show that statistical learning approaches based on fine-mapping of genetic signals result in a competitive prediction performance compared to classical polygenic risk approaches, while yielding sparser risk models.

Subjects

Genome-Wide Association Study, Single Nucleotide Polymorphism, Humans, Genome-Wide Association Study/methods, LDL Cholesterol/genetics, Genetic Models, Multifactorial Inheritance/genetics
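
The first two steps described in the abstract above, marginal screening followed by selection within linkage-disequilibrium blocks, can be sketched in a few lines of Python. This is not the authors' implementation: the block-wise step here simply keeps the best-correlated candidate per block as a stand-in for boosting with probing or the Adaptive Subspace method, the joint boosting fit of the third step is left out, and all names, the threshold, and the toy data are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def screen(genotypes, trait, threshold):
    """Step 1: keep variants whose marginal |correlation| reaches the threshold."""
    return [j for j, g in enumerate(genotypes)
            if abs(pearson(g, trait)) >= threshold]


def select_per_block(genotypes, trait, candidates, blocks):
    """Step 2 stand-in: per LD block, keep the best-correlated candidate."""
    chosen = []
    for block in blocks:
        in_block = [j for j in candidates if j in block]
        if in_block:
            chosen.append(max(in_block,
                              key=lambda j: abs(pearson(genotypes[j], trait))))
    return chosen


# Toy data: one list of allele counts (0/1/2) per variant, six individuals.
genotypes = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 1],
    [1, 1, 0, 2, 0, 1],
    [0, 0, 0, 1, 1, 1],
]
trait = [1.0, 2.0, 3.0, 1.5, 2.5, 3.5]
blocks = [{0, 1}, {2, 3}]  # assumed LD block membership

candidates = screen(genotypes, trait, threshold=0.4)
selected = select_per_block(genotypes, trait, candidates, blocks)
```

The selected variants would then be passed to a joint multivariable model, which is where the sparsity of the final score comes from.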

14.

GenRisk: a tool for comprehensive genetic risk modeling.

Aldisi, Rana; Hassanin, Emadeldin; Sivalingam, Sugirthan; Buness, Andreas; Klinkhammer, Hannah; Mayr, Andreas; Fröhlich, Holger; Krawitz, Peter; Maj, Carlo.

Bioinformatics ; 38(9): 2651-2653, 2022 Apr 28.

Article in English | MEDLINE | ID: mdl-35266528

ABSTRACT

SUMMARY: The genetic architecture of complex traits can be influenced by both many common regulatory variants with small effect sizes and rare deleterious variants in coding regions with larger effect sizes. However, the two kinds of genetic contributions are typically analyzed independently. Here, we present GenRisk, a Python package for the computation and integration of gene scores based on the burden of rare deleterious variants and common-variant-based polygenic risk scores. The derived scores can be analyzed within GenRisk to perform association tests or to derive phenotype prediction models by testing multiple classification and regression approaches. GenRisk is compatible with VCF input file formats. AVAILABILITY AND IMPLEMENTATION: GenRisk is an open-source, publicly available Python package that can be downloaded or installed from GitHub (https://github.com/AldisiRana/GenRisk). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Multifactorial Inheritance, Software, Phenotype, Open Reading Frames, Risk Factors

15.

GestaltMatcher facilitates rare disease matching using facial phenotype descriptors.

Hsieh, Tzung-Chien; Bar-Haim, Aviram; Moosa, Shahida; Ehmke, Nadja; Gripp, Karen W; Pantel, Jean Tori; Danyel, Magdalena; Mensah, Martin Atta; Horn, Denise; Rosnev, Stanislav; Fleischer, Nicole; Bonini, Guilherme; Hustinx, Alexander; Schmid, Alexander; Knaus, Alexej; Javanmardi, Behnam; Klinkhammer, Hannah; Lesmann, Hellen; Sivalingam, Sugirthan; Kamphans, Tom; Meiswinkel, Wolfgang; Ebstein, Frédéric; Krüger, Elke; Küry, Sébastien; Bézieau, Stéphane; Schmidt, Axel; Peters, Sophia; Engels, Hartmut; Mangold, Elisabeth; Kreiß, Martina; Cremer, Kirsten; Perne, Claudia; Betz, Regina C; Bender, Tim; Grundmann-Hauser, Kathrin; Haack, Tobias B; Wagner, Matias; Brunet, Theresa; Bentzen, Heidi Beate; Averdunk, Luisa; Coetzer, Kimberly Christine; Lyon, Gholson J; Spielmann, Malte; Schaaf, Christian P; Mundlos, Stefan; Nöthen, Markus M; Krawitz, Peter M.

Nat Genet ; 54(3): 349-357, 2022 Mar.

Article in English | MEDLINE | ID: mdl-35145301

ABSTRACT

Many monogenic disorders cause a characteristic facial morphology. Artificial intelligence can support physicians in recognizing these patterns by associating facial phenotypes with the underlying syndrome through training on thousands of patient photographs. However, this 'supervised' approach means that diagnoses are only possible if the disorder was part of the training set. To improve recognition of ultra-rare disorders, we developed GestaltMatcher, an encoder for portraits that is based on a deep convolutional neural network. Photographs of 17,560 patients with 1,115 rare disorders were used to define a Clinical Face Phenotype Space, in which distances between cases define syndromic similarity. Here we show that patients can be matched to others with the same molecular diagnosis even when the disorder was not included in the training set. Together with mutation data, GestaltMatcher could not only accelerate the clinical diagnosis of patients with ultra-rare disorders and facial dysmorphism but also enable the delineation of new phenotypes.

Subjects

Artificial Intelligence, Rare Diseases, Face, Humans, Computer Neural Networks, Phenotype, Rare Diseases/genetics
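
The matching idea in the abstract above, where distances between cases in a phenotype space express syndromic similarity, can be sketched as a nearest-neighbour ranking. The example assumes cosine distance and three-dimensional toy vectors; the abstract does not state the metric or the size of the encoder output, and the names `cosine_distance` and `rank_matches` are hypothetical.

```python
import math


def cosine_distance(u, v):
    """One minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def rank_matches(query, gallery):
    """Return gallery case IDs ordered from most to least similar to the query."""
    return sorted(gallery,
                  key=lambda case_id: cosine_distance(query, gallery[case_id]))


# Made-up embeddings standing in for encoded portraits.
gallery = {
    "case_A": [0.9, 0.1, 0.0],
    "case_B": [0.0, 1.0, 0.2],
    "case_C": [0.7, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]
ranking = rank_matches(query, gallery)
```

Because the ranking depends only on distances, a case can be matched to a disorder that was never a class label during training, provided another patient with that disorder is in the gallery.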

16.

A statistical boosting framework for polygenic risk scores based on large-scale genotype data.

Klinkhammer, Hannah; Staerk, Christian; Maj, Carlo; Krawitz, Peter Michael; Mayr, Andreas.

Front Genet ; 13: 1076440, 2022.

Article in English | MEDLINE | ID: mdl-36704342

ABSTRACT

Polygenic risk scores (PRS) evaluate the individual genetic liability to a certain trait and are expected to play an increasingly important role in clinical risk stratification. Most often, PRS are estimated based on summary statistics of univariate effects derived from genome-wide association studies. To improve the predictive performance of PRS, it is desirable to fit multivariable models directly on the genetic data. Due to the large and high-dimensional data, a direct application of existing methods is often not feasible and new efficient algorithms are required to overcome the computational burden regarding efficiency and memory demands. We develop an adapted component-wise L2-boosting algorithm to fit genotype data from large cohort studies to continuous outcomes using linear base-learners for the genetic variants. Similar to the snpnet approach implementing lasso regression, the proposed snpboost approach iteratively works on smaller batches of variants. By restricting the set of possible base-learners in each boosting step to variants most correlated with the residuals from previous iterations, the computational efficiency can be substantially increased without losing prediction accuracy. Furthermore, for large-scale data based on various traits from the UK Biobank we show that our method yields competitive prediction accuracy and computational efficiency compared to the snpnet approach and further commonly used methods. Due to the modular structure of boosting, our framework can be further extended to construct PRS for different outcome data and effect types; we illustrate this for the prediction of binary traits.
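
The batch-restricted boosting idea in the abstract above can be sketched in plain Python. This is a simplified illustration, not the snpboost code: the function name, the fixed refresh interval for rebuilding the batch, the learning rate, and the toy data are all assumptions, and the real method's stopping rules are not reproduced.

```python
def fit_batch_boosting(columns, y, steps=100, learning_rate=0.1,
                       batch_size=2, refresh_every=10):
    """Component-wise L2-boosting with linear base-learners on variant batches.

    columns: one list of allele counts per variant; y: continuous outcome.
    Returns (intercept, coefficients) on the original genotype scale.
    """
    means = [sum(col) / len(col) for col in columns]
    centered = [[v - m for v in col] for col, m in zip(columns, means)]
    sq_norms = [sum(v * v for v in col) for col in centered]
    y_mean = sum(y) / len(y)
    residuals = [yi - y_mean for yi in y]
    coefs = [0.0] * len(columns)
    # Monomorphic variants carry no signal and are skipped.
    usable = [j for j in range(len(columns)) if sq_norms[j] > 0]
    batch = []

    def inner(j):
        # Inner product of centered variant j with the current residuals.
        return sum(x * r for x, r in zip(centered[j], residuals))

    for step in range(steps):
        if step % refresh_every == 0:
            # Rebuild the batch from the variants most correlated with residuals.
            batch = sorted(usable,
                           key=lambda j: abs(inner(j)) / sq_norms[j] ** 0.5,
                           reverse=True)[:batch_size]
        # Pick the base-learner with the largest drop in residual sum of squares.
        best = max(batch, key=lambda j: inner(j) ** 2 / sq_norms[j])
        slope = inner(best) / sq_norms[best]
        coefs[best] += learning_rate * slope
        residuals = [r - learning_rate * slope * x
                     for r, x in zip(residuals, centered[best])]

    intercept = y_mean - sum(c * m for c, m in zip(coefs, means))
    return intercept, coefs


# Toy data: the outcome depends on the first variant only (y = 1 + 2 * x0).
columns = [
    [0, 1, 2, 0, 1, 2],
    [0, 0, 1, 1, 0, 1],
    [1, 0, 0, 2, 1, 0],
]
y = [1.0, 3.0, 5.0, 1.0, 3.0, 5.0]
intercept, coefs = fit_batch_boosting(columns, y)
```

Restricting each step to a small batch means only a few variants need to be scored per iteration, which is where the computational saving described in the abstract would come from on genome-scale data.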
